Mining Coreference Relations between Formulas and Text using Wikipedia

نویسندگان

  • Minh Nghiem Quoc
  • Keisuke Yokoi
  • Yuichiroh Matsubayashi
  • Akiko Aizawa
چکیده

In this paper, we address the problem of discovering coreference relations between formulas and the surrounding text. The task is different from traditional coreference resolution because of the unique structure of the formulas. In this paper, we present an approach, which we call ‘CDF (Concept Description Formula)’, for mining coreference relations between formulas and the concepts that refer to them. Using Wikipedia articles as a target corpus, our approach is based on surface level text matching between formulas and text, as well as patterns that represent relationships between them. The results showed the potential of our approach for formulas and text coreference mining.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Subtree Mining for Relation Extraction from Wikipedia

In this study, we address the problem of extracting relations between entities fromWikipedia’s English articles. Our proposed method first anchors the appearance of entities in Wikipedia’s articles using neither Named Entity Recognizer (NER) nor coreference resolution tool. It then classifies the relationships between entity pairs using SVM with features extracted from the web structure and sub...

متن کامل

WikiCoref: An English Coreference-annotated Corpus of Wikipedia Articles

This paper presents WikiCoref, an English corpus annotated for anaphoric relations, where all documents are from the English version of Wikipedia. Our annotation scheme follows the one of OntoNotes with a few disparities. We annotated each markable with coreference type, mention type and the equivalent Freebase topic. Since most similar annotation efforts concentrate on very specific types of w...

متن کامل

Relation Extraction from Wikipedia Using Subtree Mining

The exponential growth and reliability of Wikipedia have made it a promising data source for intelligent systems. The first challenge of Wikipedia is to make the encyclopedia machine-processable. In this study, we address the problem of extracting relations among entities from Wikipedia’s English articles, which in turn can serve for intelligent systems to satisfy users’ information needs. Our ...

متن کامل

Entity Extraction: From Unstructured Text to DBpedia RDF Triples

In this paper, we describe an end-to-end system that automatically extracts RDF triples describing entity relations and properties from unstructured text. This system is based on a pipeline of text processing modules that includes a semantic parser and a coreference solver. By using coreference chains, we group entity actions and properties described in different sentences and convert them into...

متن کامل

Using Wikitology for Cross-Document Entity Coreference Resolution

We describe the use of the Wikitology knowledge base as a resource for a variety of applications with special focus on a cross-document entity coreference resolution task. This task involves recognizing when entities and relations mentioned in different documents refer to the same object or relation in the world. Wikitology is a knowledge base system constructed with material from Wikipedia, DB...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010